Abstract

Purpose: Retinal dystrophies are genetically heterogeneous, resulting from mutations in over 200 genes. Prior to the 
development of massively parallel sequencing, comprehensive genetic screening was unobtainable for most patients. 
Identifying the causative genetic mutation facilitates genetic counselling, carrier testing and prenatal/pre-implantation 
diagnosis, and often leads to a clearer prognosis. In addition, in a proportion of cases, when the mutation is known 
treatment can be optimised and patients are eligible for enrolment into clinical trials for gene-specific therapies. 

Methods: Patient genomic DNA was sheared, tagged and pooled in batches of four samples, prior to targeted capture and 
next generation sequencing. The enrichment reagent was designed against genes listed on the RetNet database (July 2010). 
Sequence data were aligned to the human genome and variants were filtered to identify potential pathogenic mutations. 
These were confirmed by Sanger sequencing. 

Results: Molecular analysis of 20 DNAs from retinal dystrophy patients identified likely pathogenic mutations in 12 cases, 
many of them known and/or confirmed by segregation. These included previously described mutations in ABCA4 (c.6088C>T, 
p.R2030*; c.5882G>A, p.G1961E), BBS2 (c.1895G>C, p.R632P), GUCY2D (c.2512C>T, p.R838C), PROM1 (c.1117C>T, p.R373C), 
RDH12 (c.601T>C, p.C201R; c.506G>A, p.R169Q), RPGRIP1 (c.3565C>T, p.R1189*) and SPATA7 (c.253C>T, p.R85*) and new 
mutations in ABCA4 (c.3328+1G>C), CRB1 (c.2832_2842+23del), RP2 (c.884-1G>T) and USH2A (c.12874A>G, p.N4292D). 

Conclusions: Tagging and pooling DNA prior to targeted capture of known retinal dystrophy genes identified mutations in 
60% of cases. This relatively high success rate may reflect enrichment for consanguineous cases in the local Yorkshire 
population, and the use of multiplex families. Nevertheless this is a promising high throughput approach to retinal 
dystrophy diagnostics. 
Introduction

Retinal dystrophies are to date the most genetically heteroge- 
neous set of inherited conditions known to affect a single organ. 
This complicates genetic screening for conditions such as retinitis 
pigmentosa (RP), cone-rod dystrophy (CRD) and Leber congenital 
Amaurosis (LCA) since each can result from mutations in many 
genes (see RetNet, https://sph.uth.tmc.edu/retnet/) which, with 
the exception of LCA, follow dominant, recessive or X-linked 
patterns of inheritance. Nationally, inherited retinal disease 
accounts for 4.2% of all sight impairment certifications and 
5.5% of blindness cases [1]. These diseases are a more significant 
issue in the West Yorkshire population due to the high incidence of 
first cousin marriage and consequent recessive disease in the local 
Pakistani community [2]. Until recently, patients could at best be 
offered only limited counselling based on approximate recurrence 
rates for a given mode of inheritance, whilst presymptomatic 
diagnosis and carrier status testing were impossible in all but a 
minority of cases. A further incentive for seeking to improve this 
situation is the notable success of an increasing number of clinical 
trials for gene and other targeted therapies for retinal dystrophies 
[3-7]. These are gene-specific, meaning that only patients for 
whom mutations have been identified will benefit from these novel 
approaches to stratified medicine. 

In order to increase patient recruitment to new gene- or 
mutation-specific trials, several groups have already highlighted 
the potential of next generation sequencing in disease diagnosis 
[8-14]. Here we confirm the efficacy of this approach in a 
Northern UK cohort. In addition we describe the use of a 
previously published approach, tagging and DNA pooling prior to 
targeted capture and next generation sequencing [15], providing a 
valuable refinement to existing high throughput next generation 
sequencing strategies for identifying the genetic basis of retinal 
dystrophy. 

Materials and Methods 

Ethics Statement 

Patients and their relatives recruited to the study gave informed, 
written consent using a process approved by the Leeds East 
Research Ethics committee (Project number 03/362), adhering to 
the tenets of the Declaration of Helsinki. 



Samples 

The families were selected on the basis that there were multiple 
affected members with an unidentified molecular genetic diagno- 
sis. The patients were diagnosed with a retinal dystrophy by an 
experienced ophthalmologist. Pedigree structures are depicted in 
Figure 1, while diagnoses, possible inheritance patterns, ethnicity 
and summary information regarding numbers of affected cases 
and members who were available for sampling are recorded in 
Table S1 in File S1. Peripheral blood was collected from affected 
patients, their parents and unaffected relatives where available. 
Genomic DNA was extracted from blood according to standard 
procedures. 

Target design 

In order to enrich specific regions of the patient's genomic 
DNA, a liquid-phase reagent comprising 'SureSelect Target 
Enrichment' biotinylated cRNA baits was designed using the 
Agilent Technologies eArray software 
(http://www.genomics.agilent.com/) (Agilent Technologies UK 
Limited, Wokingham, UK). In total, 2,988 coding exons as well as 
a single intronic region, and their 100 bp flanking sequences, were 
selected in the UCSC genome database 
(http://www.genome.ucsc.edu/) from all of the 162 genes 
implicated in retinal degeneration (RetNet, July 2010). The list of 
genes targeted is shown in Table S2 in File S1. This consisted of 
46,287 RNA baits at 5× tiling to cover 776.5 kb of DNA sequence. 
Probes could not be designed against 9 exons (Table S3 in File S1). 

Figure 1. Family pedigrees of patients that were studied. Individuals from whom DNA was available are assigned the DNA notation in small 
lettering to the top right hand side of the symbol (and are also numbered). * highlights pedigrees that have been abbreviated for this figure. 
doi:10.1371/journal.pone.0104281.g001 

Library construction and massively parallel sequencing 

Genomic DNA was sheared using a Covaris S220 sonicator. 
Illumina sequencing adapters containing 6 bp sequence tags were 
ligated to the samples, with each DNA sample being ligated to a 
different tag. The tagged DNA libraries were pooled into batches 
and captured using the SureSelect custom baits according to the 
manufacturer's instructions. Each captured pool was sequenced 
using single-end 80 bp reads on an Illumina GAIIx Sequencer 
(Illumina Inc., Little Chesterford, UK) according to the manufac- 
turer's instructions. 

Alignments and variant detection 

Sequence data were generated in qseq format and barcode 
sorted by their unique 5' tag using NovoSort. The sorted fastq files 
have been deposited in the European Nucleotide Archive 
(http://www.ebi.ac.uk/ena/) with study accession number PRJEB6380. 
The reads were aligned to the human genome sequence, hg19, 
using Novoalign (v2.08.01). Following realignment around indels, 
the GATK (v2.0.34) Unified Genotyper was used to identify 
variants [16]. The output VCF files were annotated for analysis 
using Alamut-HT (v1.0.4) (Interactive Biosoftware, Rouen, 
France). Analysis of read depth was performed using BEDTools 
(v2.15.0) and the GATK Count Reads walker. 
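
The barcode sorting step can be illustrated with a minimal Python sketch. It is not the NovoSort implementation: the function name, the tag sequences and the sample names are hypothetical, and it assumes that the 6 bp tag occupies the first six bases of each read and must match exactly.

```python
from collections import defaultdict

TAG_LENGTH = 6  # the sequencing adapters carried 6 bp sequence tags


def demultiplex(reads, tag_to_sample, tag_length=TAG_LENGTH):
    """Group reads by their 5' tag and trim the tag from assigned reads.

    reads: iterable of read sequences (strings).
    tag_to_sample: dict mapping tag sequence -> sample name
                   (hypothetical values, not the tags used in the study).
    Reads with an unrecognised tag are kept untrimmed under 'unassigned'.
    """
    bins = defaultdict(list)
    for seq in reads:
        tag = seq[:tag_length]
        if tag in tag_to_sample:
            # Exact match only; no mismatch tolerance in this sketch.
            bins[tag_to_sample[tag]].append(seq[tag_length:])
        else:
            bins["unassigned"].append(seq)
    return dict(bins)


if __name__ == "__main__":
    tags = {"ACGTAC": "sample_A", "TTGCAA": "sample_B"}
    pooled = ["ACGTACGGGG", "TTGCAACCCC", "NNNNNNAAAA"]
    print(demultiplex(pooled, tags))
```

A production tool would typically also tolerate sequencing errors within the tag; that refinement is omitted here for brevity.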

Variants were filtered to exclude those more than 5 bp beyond 
the splice site junction. Synonymous variants and those with minor 
allele frequencies ≥0.01 in dbSNP or the 1,000 genomes project 
were also excluded. 
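
As an illustration only, the three exclusion rules above can be written as a single predicate. The `Variant` record and its field names are hypothetical and do not correspond to the Alamut-HT output format. Treating a frequency of exactly 0.01 as excluded is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SPLICE_DISTANCE = 5  # bp beyond the splice site junction
MAF_THRESHOLD = 0.01     # minor allele frequency cut-off


@dataclass
class Variant:
    gene: str
    effect: str                  # e.g. "missense", "synonymous", "nonsense", "splice"
    splice_distance: int         # 0 for coding variants, else bp from the nearest exon
    maf: Optional[float] = None  # population frequency, None if not reported


def passes_filters(v: Variant) -> bool:
    """Return True if the variant survives all three exclusion rules."""
    if v.splice_distance > MAX_SPLICE_DISTANCE:
        return False  # too deep in the intron
    if v.effect == "synonymous":
        return False  # synonymous changes were excluded
    if v.maf is not None and v.maf >= MAF_THRESHOLD:
        return False  # too common in dbSNP / 1,000 genomes
    return True


if __name__ == "__main__":
    example = Variant(gene="ABCA4", effect="nonsense", splice_distance=0)
    print(passes_filters(example))
```

Note that removing all synonymous variants also discards any that create a cryptic splice site, a limitation the authors acknowledge in the Discussion.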

From the remaining list, variants were then selected for further 
analysis if they met one or both of the following criteria. Firstly, 
variants that occurred in genes that had previously been associated 
with the observed phenotype and showed the expected pattern of 
inheritance were selected. Secondly, null alleles resulting from 
nucleotide deletions or insertions, premature stop codon mutations 
or changes affecting the conserved 2 bp adjacent to the splice site 
junction as well as missense variants with at least 2 out of 4 high 
pathogenicity scores were selected. For a high pathogenicity 
profile, scores recorded in the Alamut-HT report included 
BLOSUM62 (Blocks Substitution Matrix; http://www.uky.edu/ 
Classes/BIO/520/BIO520WWW/blosum62.htm) <0, AGVGD 
(Align Grantham Variation and Grantham Deviation; http:// 
agvgd.iarc.fr/agvgd_input.php) between C15 and C65, SIFT 
(Sorts Intolerant From Tolerant substitutions, http://sift.jcvi.org) 
<0.05 or deleterious and MAPP (Multivariate Analysis of 
Protein Polymorphism; http://mendel.stanford.edu/SidowLab/ 
downloads/MAPP) = bad. A schematic for the sequencing and 
informatics pipeline is shown in Figure 2. For any cases with a 
diagnosis of LCA, the unfiltered variant lists were also analysed for 
the deep intronic mutation c.2991+1655A>G in CEP290 that 
causes this phenotype [17]. 
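
The "at least 2 out of 4" rule for missense variants can be sketched as follows. This is a hedged illustration rather than the authors' code: the function and argument names are hypothetical, the score thresholds are those quoted above, "between C15 and C65" is read as inclusive, and a missing score is simply counted as not high.

```python
# Align-GVGD classes from C15 to C65 inclusive; C0 is the remaining class.
AGVGD_HIGH_CLASSES = {"C15", "C25", "C35", "C45", "C55", "C65"}


def high_pathogenicity_count(blosum62, agvgd_class, sift_score, mapp_prediction):
    """Count how many of the four scores fall in the 'high' range.

    Any argument may be None when the score is unavailable.
    """
    checks = [
        blosum62 is not None and blosum62 < 0,
        agvgd_class in AGVGD_HIGH_CLASSES,
        sift_score is not None and sift_score < 0.05,
        mapp_prediction == "bad",
    ]
    return sum(checks)


def has_high_pathogenicity_profile(blosum62, agvgd_class, sift_score,
                                   mapp_prediction, minimum=2):
    """Apply the 'at least 2 out of 4 high scores' rule."""
    count = high_pathogenicity_count(blosum62, agvgd_class,
                                     sift_score, mapp_prediction)
    return count >= minimum


if __name__ == "__main__":
    # Hypothetical scores for a single missense variant.
    print(has_high_pathogenicity_profile(-3, "C65", 0.0, "bad"))
```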



Figure 2. Schematic for next generation sequencing and 
variant detection. The strategy for NGS library preparation (A) and 
informatics used (B) are depicted. 
doi:10.1371/journal.pone.0104281.g002 



Sanger sequencing of potential disease-causing variants 

Variants selected by the above criteria were confirmed by 
conventional Sanger sequencing of patient genomic DNA using 
the BigDye terminator cycle sequencing kit (Applied Biosystems, 
Paisley, UK) on an ABI3130xl sequencer (Applied Biosystems) and 
analysed using Sequencing Analysis v5.2 software (Applied 
Biosystems). This was used to confirm presence of the mutation 
and test whether the mutation segregated with the disease 
phenotype in the family in question. 

Confirmed pathogenic mutations were deposited in the publicly 
available LOVD database (http://databases.lovd.nl/shared/). 

Results 

Validating the capture reagent and establishing a 
pipeline for variant detection 

To test the feasibility of identifying pathogenic mutations in 
genomic DNA from patients with retinal degeneration, we selected 
four patients in whom, by Sanger sequencing of candidate genes, 
we had identified mutation(s) deemed clearly causative based on 
exclusion from control cohorts, predicted pathogenicity and 
segregation in additional family members. The analysis of the 
data for this study was conducted by one of the co-authors (David 
A Parry) without prior knowledge of these known mutations in the 
samples. Briefly, a sequencing adapter containing a different 6 bp 
sequence tag was ligated to each patient's sonicated DNA. The 
tagged aliquots were pooled prior to hybridisation against the 
target enrichment reagent and run on a single lane of the Illumina 
GAIIx DNA sequencer. The sequence data for each sample was 
sorted by sequence tag and aligned against the human reference 
sequence for analysis of coverage and read depth (Table 1). 
Pooling of 4 samples gave a range of coverage between 95.6% and 
96.9% with at least 20 good quality reads following duplicate 
removal and between 1% and 2% that had less than 5× read depth. 
A list of variants was generated for each sample and these were 
filtered without family history information according to the criteria 
highlighted in Table 2, and described in the Methods section, to 
give rise to a list of candidate variants for each sample (Table S4 in 
File S1). 
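
The two coverage figures quoted above (the percentage of targeted bases with at least 20 reads, and the percentage with fewer than 5) can be derived from a list of per-base depths. The sketch below is illustrative only. The study used BEDTools and the GATK Count Reads walker for this step, and the function name and example depths here are hypothetical.

```python
def coverage_summary(depths, high=20, low=5):
    """Return (% of bases with depth >= high, % of bases with depth < low).

    depths: per-base read depths across the targeted region,
            counted after duplicate removal.
    """
    n = len(depths)
    if n == 0:
        raise ValueError("no targeted bases supplied")
    pct_high = 100.0 * sum(d >= high for d in depths) / n
    pct_low = 100.0 * sum(d < low for d in depths) / n
    return pct_high, pct_low


if __name__ == "__main__":
    # Ten hypothetical per-base depths, not data from the study.
    toy_depths = [0, 4, 5, 19, 20, 25, 30, 40, 50, 100]
    print(coverage_summary(toy_depths))
```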

Prioritisation of the variants was based on whether the genotype 
was consistent with disease symptoms in the family, the variant 
type and pathogenicity scores. For sample A with a diagnosis of 
RP, heterozygous mutations in RP9, RP1 and FSCN2 were 
deemed consistent with disease symptoms, and of these a high 
pathogenicity profile suggested that the strongest candidate for 
causation in sample A was the RP9 variant. For sample B, though 
a number of changes were observed, only compound heterozy- 
gosity for a premature stop codon and a high pathogenicity 
missense mutation in CRB1 fitted with the LCA diagnosis in this 
patient. For sample C, heterozygous variants in RP1 and a 
homozygous variant in USH2A were considered possible candi- 
dates for causing RP in this patient. However based on 
pathogenicity scores and variant type, the strongest candidates 
for disease causation in sample C were the RP1 variants. For 
sample D, only a null mutation in PRPF31 was identified as 
consistent with the diagnosis of RP. 
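
The genotype-consistency part of this prioritisation can be sketched in Python. The example is a simplification under stated assumptions, not the procedure used in the study. It treats a gene as compatible with dominant inheritance if it carries a heterozygous variant, with recessive inheritance if it carries a homozygous variant or at least two heterozygous variants (possible compound heterozygosity, phase unknown), and with X-linked inheritance if it carries a hemizygous variant. The function name and zygosity labels are hypothetical.

```python
from collections import Counter


def candidate_genes(variants, mode):
    """Return genes whose genotypes fit the suspected inheritance mode.

    variants: list of (gene, zygosity) pairs, zygosity in {"het", "hom", "hemi"}.
    mode: "dominant", "recessive" or "x-linked".
    """
    # Number of heterozygous variants per gene, used to flag
    # possible compound heterozygosity.
    het_counts = Counter(gene for gene, zyg in variants if zyg == "het")
    genes = set()
    for gene, zyg in variants:
        if mode == "dominant" and zyg == "het":
            genes.add(gene)
        elif mode == "recessive" and (zyg == "hom" or het_counts[gene] >= 2):
            genes.add(gene)
        elif mode == "x-linked" and zyg == "hemi":
            genes.add(gene)
    return sorted(genes)


if __name__ == "__main__":
    # Hypothetical filtered variant list for one sample.
    calls = [("CRB1", "het"), ("CRB1", "het"), ("RP1", "het"), ("USH2A", "hom")]
    print(candidate_genes(calls, "recessive"))
```

A shortlist produced this way still has to be ranked by variant type and pathogenicity scores, and confirmed by segregation, as described in the text.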

The variants that had previously been deemed causative in each 
sample are shown in Table 3. As these variants had indeed been 
implicated as candidates for pathogenicity following filtering and 
prioritisation as highlighted above, without the need for segrega- 
tion analysis, this confirmed that the pipeline used to identify 
pathogenic mutations was robust. 

Screening patients with unknown mutations 

We then selected 20 patients with various retinal degenerations 
for which no mutation had yet been identified and performed the 
pre-capture pooling procedure on the tagged DNA libraries 
pooled in batches of four samples. Following alignment, variant 
detection and filtering as described in the Methods, a list of 
candidate variants was identified for each sample (Table S5 in 
File S1). Candidate variants were prioritised as described 
previously and Sanger sequenced to confirm the presence of the 
mutation. Segregation was performed where DNA from other 
family members was available. 

For MA1, family history suggested LCA with recessive 
inheritance caused by an autozygous mutation. The variant list 
following analysis of patient 2906 (a female) suggested the 
homozygous CRB1 mutation (c.2832_2842+23del) as the only 
candidate consistent with the diagnosis in the family [18]. Analysis 
of the other affected case from whom DNA was available (2907) 
confirmed the CRB1 mutation as the pathogenic cause of disease. 

For MA2, family history of the index case (2844, a male) with 
unaffected parents and consanguinity suggested recessive inheri- 
tance caused by an autozygous mutation. The variant list following 
analysis of this case suggested a previously-identified homozygous 
nonsense mutation in ABCA4 (c.6088C>T, p.R2030*) [19] 
consistent with a diagnosis of CRD as the primary candidate. 
This mutation was indeed confirmed in the index case and 
subsequently found to be heterozygous in his affected offspring 
(2843 and 2845) suggesting that they both had an unidentified 
ABCA4 mutation on their other allele which they had inherited 
from their mother. 

For MA3, family history suggested RP with recessive inheri- 
tance due to an autozygous mutation. The variant list following 
analysis of patient 2908 (a female) identified a homozygous 
missense variant in USH2A (c.12874A>G, p.N4292D) with a 
high pathogenicity profile as the sole candidate. The USH2A 
mutation was indeed subsequently confirmed in both affected 
cases from whom DNA was available. 

Table 2. Filtering the variant lists following targeted capture and next generation sequencing for the 4 patient verification study. 

Filtering process                     Patient A  Patient B  Patient C  Patient D
Total variants identified             614        564        595        580
Exclude outside exon/splice junction  278        282        269        260
Exclude synonymous variants           134        142        131        124
Exclude if MAF >0.01                  7          12         10         3

Exon constitutes coding variants only. Splice junction constitutes +/-5 bp around an exon. A full list of variants is shown in Table S4. 

For MA4, family history suggested recessive inheritance of RP 
and an autozygous mutation. The variant list following analysis of 
case 2833 (a male) highlighted two homozygous missense variants 
in EYS as possible candidates. Following analysis of the other 
affected case (2910), both EYS variants were homozygous and 
Sanger sequencing of the EYS terminal exon that was not covered 
by the capture reagent failed to identify any other changes. One of 
the EYS variants (c.7558T>C, p.F2520L) disrupts the second 
laminin G subdomain which is essential for normal protein 
function [20]. Given the degree of co-segregation and consistency 
with phenotype, this was considered the most likely variant to be 
pathogenic, but given the low pathogenicity profile scores due to 
the lack of amino acid conservation of the normal residue in 
vertebrates (data not shown), the variant was considered unproven. 

For MA5, family history suggested dominant inheritance of a 
CRD phenotype. The variant list following analysis of patient 
2278 (a female) did not highlight any obvious candidates. 

For MA6, family history suggested recessive inheritance of RP 
with an autozygous mutation. The variant list described a 
previously identified homozygous missense mutation in RDH12 
(c.601T>C, p.C201R) [21] with a high pathogenicity profile 
which was confirmed in the case (a male) as the likely cause of 
disease. 

For MA7, family history suggested dominant inheritance of 
CRD. The variant list following analysis of patient 114 (a male) 
highlighted the heterozygous PROM1 mutation (c.1117C>T, 
p.R373C) which was previously identified in patients with a 
diagnosis of cone-rod dystrophy [22,23] as the possible cause of 
disease symptoms. This was confirmed by segregation in the 
family. 

For MA8, family history suggested dominant or X-linked 
inheritance of RP with macular involvement. The variant list 
derived from analysing case 40 (a male) described a dominant 
variant in NR2E3 and an X-linked variant in RP2 as the most 
likely candidates. Analysis of the variants in additional family 
members for segregation identified that only the splicing variant in 
RP2 (c.884-1G>T) followed disease symptoms as X-linked 
dominant inheritance in the family. 

For MA9, family history suggested dominant inheritance of a 
macular dystrophy phenotype. The variant list derived from 
analysing case 530 (a female) identified heterozygous variants in 
HMCN1 and the previously reported GUCY2D [24,25] as the 
most likely candidates. Analysis of additional family members from 
whom DNA was available only confirmed segregation of the 
GUCY2D mutation (c.2512C>T, p.R838C) with disease symp- 
toms in the family. 

For MA10, family history suggested recessive inheritance of 
CRD with an autozygous mutation. The variant list from 
analysing case 1857 (a male) highlighted only one candidate, a 
homozygous null variant in RPGRIP1 (c.3565C>T, p.R1189*) 
that was recently reported independently as a pathogenic cause of 
disease [26]. Segregation analysis confirmed this mutation as the 
cause of disease symptoms in this family. 

For MA11, family history suggested recessive RP with an 
autozygous mutation. The variant list derived from analysing 
patient 2093 (a male) described a homozygous missense variant in 
BBS2 (c.1895G>C, p.R632P) as the most likely candidate. 
Analysis of the other affected case 1267 confirmed that the 
BBS2 mutation, which was recently reported to be a common 
cause of RP in the Ashkenazi Jewish population [27], was the likely 
pathogenic cause of disease. 

For MA12, family history suggested recessive CRD. The variant 
list derived from case 1024 (a male) highlighted two heterozygous 
missense variants in CDH23 as possible candidates even though 
recessive mutations in this gene usually cause Usher syndrome. 
The absence of segregation in other family members suggested 
that these variants were not the pathogenic cause of disease in this 
family. 

For MA13, family history suggested recessive inheritance of RP. 
Analysis of the variant list from case 863 (a female) identified 
missense variants in GPR98 and MYO7A as the best candidates 
even though mutations in these genes usually cause recessive 
Usher syndrome. On the basis of higher pathogenicity profiles, the 
GPR98 variants were analysed further. Segregation analysis 
confirmed that these variants were not the cause of disease 
symptoms in this family. 

For MA14, family history suggested RP with recessive 
inheritance due to an autozygous mutation in each case. The 
variant list for patient 1518 (a male) identified two heterozygous 
variants in BBS12 and one in FSCN2 as possible candidates 
though neither option appeared to fit the observed phenotype 
perfectly. Following analysis of the other affected sibling (1527) 
these variants did not segregate with the disease phenotype and so 
were unlikely to be the pathogenic cause of disease in this family. 

For MA15, family history suggested recessive CRD with an 
autozygous mutation. The variant list for patient 3283 (a male) 
identified a previously reported homozygous null variant in 
SPATA7 (c.253C>T, p.R85*) [28] as the most likely candidate. 
Analysis of DNA from other family members highlighted that this 
variant segregated with the disease phenotype as expected. 

For MA16 with a diagnosis of LCA, family history of the index 
case (3341, a male) suggested recessive inheritance and an 
autozygous mutation. The variant list from analysing 3340 
highlighted only the previously reported LCA causing RDH12 
variant (c.506G>A, p.R169Q) [29] as the likely cause of disease. 
This mutation was confirmed in the other family member. 

For MA17, family history suggested recessive inheritance of 
RCD caused by an autozygous mutation. From the variant list of 
patient 3347 (a male), no obvious candidates could be identified. 

For MA18, family history suggested CRD with recessive 
inheritance. Analysis of the variant list of case 1484 (a female) 
identified compound heterozygous variants in ABCA4, the 
previously reported missense variant (c.5882G>A, p.G1961E) 
[30,31] and a splicing variant (c.3328+1G>C), as the changes 
most likely to account for the CRD in this family. This was 
confirmed by segregation analysis of the variants. 

For MA19, family history suggested recessive inheritance of 
RCD with an autozygous mutation. The 
variant list of patient 1885 (a male) identified compound 
heterozygous variants in CC2D2A and PCDH15 as well as a 
variant in WFS1 with a high pathogenicity profile as possible 
candidates though none of the options appeared to fit the observed 
phenotype perfectly. Analysis of family members from whom 
DNA was available confirmed three of the putative variants were 
artefacts and the remaining ones in CC2D2A and WFS1 did not 
segregate with disease. 

For MA20, family history suggested RP with recessive 
inheritance due to an autozygous mutation. The variant list of 
case 472 (a male) identified a single homozygous missense variant 
in TRPM1 as well as compound heterozygous variants in 
CEP290 and a variant in CA4, though none of these candidates 
appeared to exactly fit the observed phenotype. As suspected, 
these variants were either artefacts or failed to segregate with 
disease in this family suggesting that the pathogenic cause of 
disease has yet to be identified. 

Using this approach likely pathogenic mutation(s) were identified 
in 12 out of 20 cases (60%). A list of these mutations is 
highlighted in Table 4 and the sequence chromatograms of each 
candidate variant highlighted in Figure S1 in File S1. To 
summarise, the mutations consisted of previously reported 
mutations of clinical significance in ABCA4 (c.6088C>T, 
p.R2030* [19] and c.5882G>A, p.G1961E [30,31]), RDH12 
(c.601T>C, p.C201R [21] and c.506G>A, p.R169Q [29]), 
PROM1 (c.1117C>T, p.R373C [22,23]), GUCY2D (c.2512C>T, 
p.R838C [24,25]), RPGRIP1 (c.3565C>T, p.R1189* [26]), 
BBS2 (c.1895G>C, p.R632P [27]) and SPATA7 (c.253C>T, 
p.R85* [28]) and new mutations in CRB1 (c.2832_2842+23del), 
USH2A (c.12874A>G, p.N4292D), RP2 (c.884-1G>T) and 
ABCA4 (c.3328+1G>C). Of the 8 cases for which the pathogenic 
mutation could not be identified, the absence of zero-coverage 
targeted regions suggested that a homozygous deletion removing 
an exon(s) was not the cause of disease in these patients. 

Discussion 

In this paper we describe a previously published strategy for 
target capture and next generation sequencing that utilises tagging 
and pooling of DNAs in batches of four prior to enrichment [15]. 
This approach refines the use of targeted capture technology, 
facilitating the enrichment of exons from pooled samples using a 
single aliquot of capture reagent. This strategy differs from 
previously described methods which usually pool samples after the 
hybridization step to multiplex onto one lane of the sequencer. 
The technology described herein will contribute to the development 
of a retinal dystrophy diagnostic screening service by 
reducing costs associated with using a single capture reagent to 
analyse up to four samples in a single experiment. We also describe 
use of a reagent designed to enrich patient genomic DNA for all 
retinal dystrophy genes that were listed in Retnet as of July 2010. 
A recent update in January 2014 has 66 additional genes found to 
have mutations causing retinal dystrophy that were not included in 
the reagent used in this study. The flexibility of our approach 
means that these genes can be incorporated into subsequent 
versions of the targeted reagent. A methodological drawback of the 
targeted hybridisation approach is that regions containing repeat 
sequences cannot be adequately covered due to binding of the 
target DNA to multiple sites of repetitive sequence. In the current 
reagent, 9 exons including the RPGR ORF15 could not be 
covered because of repeat sequence, suggesting that these exons 
will have to be sequenced using alternative methods. In terms of 
data analysis, we observed a number of sequencing artefacts that 
may be due to low coverage, low sequence quality or the pooling 
of DNA samples but the most likely source was due to variant 
calling. In order to reduce the number of false negative results the 
stringency of variant calling algorithm was relaxed. This encompassing 
approach to capture all possible variants inevitably meant 
that there were also a number of false positives in the annotated 
variant lists. 

Table 5. Comparison of the methodological approaches in recent publications that have used high throughput next generation 
sequencing for retinal disease diagnosis. 

Authors [Reference]        Detecting phenotypes  Gene number              Library preparation method                          NGS instrument                                 Independent samples tested  Pathogenic mutation identified (%)
Bowne et al [8]            adRP                  46                       PCR amplicons                                       454GS FLX Titanium (Roche) & GAIIx (Illumina)  21                          5 (24%)
Simpson et al [9]          RP                    45                       Solid phase customised capture array (NimbleGen)    GAIIx (Illumina)                               5                           3 (60%)
Coppieters et al [11]      LCA                   16                       PCR amplicons                                       GAIIx (Illumina)                               17                          3 (18%)
Neveling et al [12]        RP                    111                      Solid phase customised capture array (NimbleGen)    454GS FLX Titanium (Roche)                     100                         36 (36%)
Audo et al [10]            RD                    254                      Liquid phase targeted SureSelect capture (Agilent)  GAIIx (Illumina)                               13                          7 (54%)
O'Sullivan et al [13]      RD                    105                      Liquid phase targeted SureSelect capture (Agilent)  SOLiD 4 (Life Technologies)                    50                          21 (42%)
Shanks et al [14]          RP & CRD              73                       Solid phase customised capture array (NimbleGen)    454GS FLX Titanium (Roche)                     36                          9 (25%)
Watson et al [This paper]  RD                    162 (Retnet, July 2010)  Liquid phase targeted SureSelect capture (Agilent)  GAIIx (Illumina)                               20                          12 (60%)

adRP = autosomal dominant retinitis pigmentosa; CRD = cone rod dystrophy; LCA = leber congenital amaurosis; 
RD = retinal dystrophies; RP = retinitis pigmentosa. 
doi:10.1371/journal.pone.0104281.t005 

The use of next generation sequencing for retinal disease 
diagnosis has been previously described (see Table 5). Researchers 
have used different target enrichment methods such as solid phase 
capture arrays [9,12,14] or PCR amplicon-based approaches 
[8,11] as opposed to liquid phase capture [10,13] and have run the 
libraries on different machines such as the Roche 454 [8,12,14] or 
the ABI SOLiD [13] rather than the Illumina Genome Analyser 
[8-11]. Success in identifying the pathogenic mutation has, to date, 
varied from 18% (3 out of 17 cases studied) [11] to 60% (3 out of 5 
cases studied) [9] and there does not appear to be any correlation 
between successfully identifying the pathogenic mutation and the 
library preparation method or machine used for the study. The 
approach described in this paper gave a 60% (12 out of 20 cases 
studied) success rate, which is higher than the majority of previous 
studies. One possible reason for this may be that we focussed on 
studying families with multiple affected members rather than 
single cases with no family history. This allowed us to assess the 
pathogenicity of candidate disease causing variants by following 
the transmission of the mutation with the disease phenotype. It is 
interesting to note when studying isolated cases that several 
examples of de novo mutations as the cause of disease have been 
demonstrated [12,14]. Another possible reason for the increased 
detection rate in this study is the high number of consanguineous 
cases in the local Yorkshire population, which allows filtering on 
the basis of homozygosity. 
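
The detection rates compared above can be recomputed from the solved and tested counts reported in Table 5. The short Python sketch below is only an arithmetic check. The dictionary layout and function name are illustrative, and percentages are rounded to the nearest whole number as in the table.

```python
# (cases with a pathogenic mutation identified, independent samples tested),
# taken from Table 5.
STUDIES = {
    "Bowne et al [8]": (5, 21),
    "Simpson et al [9]": (3, 5),
    "Coppieters et al [11]": (3, 17),
    "Neveling et al [12]": (36, 100),
    "Audo et al [10]": (7, 13),
    "O'Sullivan et al [13]": (21, 50),
    "Shanks et al [14]": (9, 36),
    "Watson et al [This paper]": (12, 20),
}


def detection_rate(solved, tested):
    """Percentage of tested cases in which a mutation was identified."""
    return round(100 * solved / tested)


RATES = {study: detection_rate(*counts) for study, counts in STUDIES.items()}

if __name__ == "__main__":
    for study, rate in sorted(RATES.items(), key=lambda item: item[1]):
        print(f"{study}: {rate}%")
```

Several of these cohorts are small (for example, 5 and 13 samples), so the percentages should be compared with caution.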

Patient feedback has highlighted the need for, and perceived 
value of, a definitive diagnosis based on genetic testing, and has 
shown that patients are motivated by a variety of factors to seek 
genetic testing [32]. Individuals may see many different eye 
specialists before a definitive diagnosis is made, whereas genetic 
testing can rapidly provide an accurate diagnosis. Furthermore, a 
genetic diagnosis can confirm the way in which the condition is 
inherited, giving clearer estimates of risk for patients and their 
relatives thus informing family planning decisions. Genetic testing 
can also facilitate pre-implantation diagnosis or prenatal testing as 
well as carrier testing in those who wish to know. In some cases 
such information may lead to improvements in therapy or direct 
patients towards trials for new potential therapies. It can also 
provide patients with an accurate guide to future function. Using 
this information, individuals can make informed decisions 
regarding education, employment and lifestyle. 

To conclude, we report here that tagging DNA and pooling 
samples prior to hybridisation capture and next generation 
sequencing is a viable high throughput method for the genetic 
diagnosis of retinal dystrophies. This approach leaves a residual 
cohort of patients and families with retinal dystrophy that could 
not be resolved using the methods described. Their mutations may 
be in the known genes within regions that were not targeted such 
as the regulatory or intronic regions or one of the 9 exons of 
repetitive sequence. Alternatively, the mutation may be a cryptic 
splice site created by one of the synonymous variants that were 
removed during filtering. On the other hand, the mutation may be 
in one of the 66 additional genes that have been added to RetNet 
since the capturing reagent was manufactured, or it may be in a 
new gene that has never been implicated in retinal dystrophy. 
Nevertheless, this cohort serves as a powerful resource for further 
gene and mutation discovery by whole exome as well as genome 
sequencing. 

Supporting Information 

File S1 Supplementary figure and tables. 

(PDF) 